Practical  Issues  in  the 
Complexity  of  Neural  Networks: 

Final  Technical  Report 

Ian  Parberry 
Piotr  Berman 
Georg  Schnitger 

Dept. of Computer Science 
333  Whitmore  Laboratory 
Penn  State  University 
University  Park,  Pa  16802. 

1.  Research  Objectives 

One criticism of theoretical neural network research is that the results are of
asymptotic interest only; that is, the "hidden" constant multiples in the resource
analysis make them of no immediate practical utility. Our objectives were to leaven
the theoretical work done under Grant AFOSR-87-0400 with practical experience.
Experiments were also done in order to generate open questions, hypotheses, and
conjectures which are the subject of ongoing research.


2.  Accomplishments 


2.1.  Synaptic  Weights 

The discrete neural network model uses neurons which compute a function of the form
$f : B^n \to B$ (where $B$ denotes the Boolean set $\{0,1\}$), such that

$$f(x_1, \ldots, x_n) = g\Big(\sum_{i=1}^{n} w_i x_i\Big),$$

where $g$ is the step function $g : \mathbb{R} \to B$ defined by $g(x) = 1$ iff
$x \ge 0$, and $w_1, \ldots, w_n \in \mathbb{R}$.
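
As a minimal sketch of the neuron model just defined (the function names are illustrative, not taken from the report), a single threshold gate can be written as follows. Since the definition has no separate threshold term, the example assumes a bias is supplied by fixing one input to 1.

```python
from typing import Sequence


def step(x: float) -> int:
    """Step function g: returns 1 iff x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def threshold_gate(weights: Sequence[float], inputs: Sequence[int]) -> int:
    """Compute f(x_1, ..., x_n) = g(sum_i w_i * x_i) for Boolean inputs."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return step(sum(w * x for w, x in zip(weights, inputs)))


if __name__ == "__main__":
    # Illustrative assumption: the third input is held at 1 and acts as a bias,
    # so the gate computes AND(x1, x2) = g(x1 + x2 - 2).
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, threshold_gate([1, 1, -2], [x1, x2, 1]))
```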


Muroga et al. [1] showed that the weights can be made integers bounded above in
magnitude by $O(n^{n/2})$. Their proof reduces to finding the maximum value of the
ratio of the determinants of two zero-one matrices, where the numerator is equal to
the denominator with one column replaced by a zero-one vector. They ignore the
denominator and bound the numerator. We are currently trying to improve this upper
bound, or to find a matching lower bound. The only progress that we have made so far
is to extend the result to non-Boolean domains.
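
One standard way to see why bounding the numerator alone gives a bound of this shape is Hadamard's inequality. This is offered as an illustration under stated assumptions, not necessarily the argument used in [1]. Here $N$ denotes the numerator matrix with columns $a_1, \ldots, a_n$ and $D$ the denominator matrix; both symbols are introduced only for this sketch.

```latex
% Every column a_j of the zero-one matrix N has at most n ones,
% so its Euclidean norm is at most sqrt(n).
\[
  |\det N| \;\le\; \prod_{j=1}^{n} \lVert a_j \rVert_2
           \;\le\; \bigl(\sqrt{n}\,\bigr)^{n} \;=\; n^{n/2}.
\]
% A nonsingular zero-one matrix D has a nonzero integer determinant,
% so |det D| >= 1 and the denominator can be ignored:
\[
  \left| \frac{\det N}{\det D} \right| \;\le\; |\det N| \;\le\; n^{n/2}.
\]
```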

We have run experiments that lead us to believe that a better worst-case upper bound
is $O(n^{n/4})$, although $2^{O(n)}$ is still possible. Preliminary theoretical
results indicate that the average synaptic weight is very small. Our experiments have
verified that this is true even for small values of $n$, indicating that the theory in
this case is immediately applicable in practice, not just for asymptotically large
values of $n$ that would be impossible to reach with current technology.

Research on this subject is still in progress. We are currently writing a more
efficient program which will enable us to evaluate the ratio of the determinants for
larger values of n. The initial program was written in Pascal, and consisted of a
multi-word arithmetic package, a rational arithmetic package, a matrix manipulation
package, and a main body which picked a large number of random matrices and vectors,
and evaluated the ratio of the determinants described above. The program was
hand-coded, without the use of any proprietary software or libraries. We plan to
translate the program to C (compiled with gcc) to improve its speed. Results will be
reported in [1].
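
The experiment described above can be sketched in a few lines of Python. This is not the authors' Pascal program: the function names, the uniform sampling of entries, the trial count, and the choice to skip singular denominators are all assumptions made for illustration. Python's built-in integers and `fractions.Fraction` stand in for the multi-word and rational arithmetic packages.

```python
import random
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    n = len(matrix)
    a = [[Fraction(v) for v in row] for row in matrix]
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor != 0:
                for c in range(col, n):
                    a[r][c] -= factor * a[col][c]
    return result


def max_ratio(n, trials, seed=0):
    """Largest |det(numerator) / det(denominator)| seen over random samples.

    The denominator is a random n x n zero-one matrix; the numerator is the
    same matrix with one random column replaced by a random zero-one vector.
    Singular denominators are skipped (an assumption of this sketch).
    """
    rng = random.Random(seed)
    best = Fraction(0)
    for _ in range(trials):
        denom = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        d = det(denom)
        if d == 0:
            continue
        vec = [rng.randint(0, 1) for _ in range(n)]
        col = rng.randrange(n)
        numer = [row[:] for row in denom]
        for r in range(n):
            numer[r][col] = vec[r]
        best = max(best, abs(det(numer) / d))
    return best


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, max_ratio(n, trials=2000))
```

Random sampling of this kind only gives a lower estimate of the worst-case ratio, since the extremal matrices may be rare.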

A better upper bound for the synaptic weights would lead to improved upper bounds for
many problems, such as integer addition and multiplication. These functions can be
computed in constant depth and polynomial size by a discrete neural network; the size
would be reduced by a polylog factor. It would also reduce the size required for the
simulation of a discrete neural network by one with unit weights.

2.2.  Convergence  of  Learning  Algorithms 

We investigated the speed of convergence of distributed learning algorithms learning a
threshold function on a single threshold gate. The learning algorithms considered
include Perceptron learning, Littlestone's Winnow algorithm, the delta rule, and the
generalized delta rule. The results so far are inconclusive, but indicate slow
convergence of all of these procedures in general. The emphasis of further research
will be on finding specific examples on which Littlestone's procedure performs poorly
(his procedure will learn thresholds with small weights and relatively large
separation).
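
To make the comparison concrete, the following sketch trains a single gate with a Perceptron update and with a Winnow1-style update (promotion by doubling, elimination by zeroing) and counts mistakes. It is not the experimental code from this project. The target function (the disjunction of the first two inputs), the threshold n/2, the factor 2, and the cyclic presentation of all Boolean inputs are assumptions chosen for illustration.

```python
from itertools import product


def target(x):
    """Hypothetical target: the monotone disjunction x[0] OR x[1]."""
    return 1 if (x[0] or x[1]) else 0


def perceptron_predict(w, x):
    # The last weight is a bias attached to a constant input of 1.
    return 1 if sum(wi * xi for wi, xi in zip(w, x + (1,))) >= 0 else 0


def perceptron_update(w, x, y):
    # Additive update: move the weights toward (y = 1) or away from (y = 0) x.
    sign = 1 if y == 1 else -1
    for i, xi in enumerate(x + (1,)):
        w[i] += sign * xi


def winnow_predict(w, x):
    # Predict 1 iff the weighted sum exceeds the threshold n / 2.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > len(w) / 2 else 0


def winnow_update(w, x, y):
    # Multiplicative update on active inputs only: double on a false
    # negative, set to zero on a false positive.
    for i, xi in enumerate(x):
        if xi:
            w[i] = 2 * w[i] if y == 1 else 0


def train(predict, update, w, examples, max_epochs=1000):
    """Cycle through the examples until a full pass makes no mistakes."""
    mistakes = 0
    for _ in range(max_epochs):
        clean = True
        for x in examples:
            y = target(x)
            if predict(w, x) != y:
                update(w, x, y)
                mistakes += 1
                clean = False
        if clean:
            return mistakes
    raise RuntimeError("no mistake-free pass within max_epochs")


if __name__ == "__main__":
    n = 6
    examples = list(product((0, 1), repeat=n))
    w_perceptron = [0] * (n + 1)
    w_winnow = [1] * n
    print("perceptron mistakes:",
          train(perceptron_predict, perceptron_update, w_perceptron, examples))
    print("winnow mistakes:",
          train(winnow_predict, winnow_update, w_winnow, examples))
```

A disjunction of few inputs has small weights and large separation, so it is a favorable case for Winnow; the hard examples sought in the text would have to look quite different.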

2.3.  Analog  Neural  Networks 

The  equipment  was  also  used  for  the  design  and  testing  of  some  of  the  neural  circuits 
reported  in  [2,3].  These  papers  analyze  the  complexity  of  computation  and  learning 
in  analog  neural  networks  with  limited  precision. 

2.4.  Simulated  Annealing 

There has been much interest recently in using neural networks to solve optimization
problems stochastically using Simulated Annealing (for example, the Boltzmann
machine). It is not clear that Simulated Annealing does well on examples generated in
practice. In a class on Computer Aided Design for VLSI, students did research on
Simulated Annealing and on general purpose routing procedures (line expansion,
computational geometry approaches). The emphasis in the Simulated Annealing project
was a comparison of Simulated Annealing with special purpose procedures
(Kernighan-Lin; Leighton-Rao) for Graph Bisection. Whereas it is known that Simulated
Annealing performs well (given sufficient time) for random graphs, little work has
apparently been done for highly structured graphs. The results of the project concern
the hypercube, cube-connected cycles, and weakly-connected graphs (a singly
interconnected collection of rings). Simulated Annealing did not perform significantly
better than a randomized version of Kernighan-Lin, but used significantly more time.
We hope to extend this study later to cover a broader class of highly structured
graphs.
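
A minimal sketch of Simulated Annealing for Graph Bisection on the hypercube is given below. It is not the students' implementation: the swap move, the geometric cooling schedule, and all parameter values (`steps`, `t0`, `cooling`) are assumptions picked for illustration.

```python
import math
import random


def hypercube_edges(d):
    """Edges of the d-dimensional hypercube on vertices 0 .. 2**d - 1."""
    return [(v, v ^ (1 << b))
            for v in range(2 ** d)
            for b in range(d)
            if v < v ^ (1 << b)]


def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])


def anneal_bisection(d, steps=20000, t0=2.0, cooling=0.9995, seed=0):
    """Search for a small balanced cut by swapping vertices across the cut."""
    rng = random.Random(seed)
    n = 2 ** d
    edges = hypercube_edges(d)
    # Start from a random balanced partition.
    order = list(range(n))
    rng.shuffle(order)
    side = [0] * n
    for v in order[: n // 2]:
        side[v] = 1
    cost = cut_size(edges, side)
    best_side, best_cost = side[:], cost
    temp = t0
    for _ in range(steps):
        u, v = rng.randrange(n), rng.randrange(n)
        if side[u] == side[v]:
            continue  # only swaps across the cut keep the partition balanced
        side[u], side[v] = side[v], side[u]
        new_cost = cut_size(edges, side)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_side, best_cost = side[:], cost
        else:
            side[u], side[v] = side[v], side[u]  # undo the swap
        temp *= cooling
    return best_side, best_cost


if __name__ == "__main__":
    d = 5
    _, cost = anneal_bisection(d)
    # The bisection width of the d-dimensional hypercube is 2**(d - 1).
    print("cut found:", cost, "optimal:", 2 ** (d - 1))
```

Recomputing the cut from scratch on every move keeps the sketch short; a serious comparison of running times would update the cost incrementally.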

2.5.  Distributed  Data  Bases 

In a replicated data base one can increase availability of data by allowing the
retrieval or modification of the data even if not all copies are accessible. This way
transactions may be performed even when some nodes (computers) or links are not
operational. The problem is to maximize the availability while preventing the
existence of inconsistent copies of the data (preserving the integrity of the
replicated data base).

One scheme is to assign each node a weight (number of votes) according to its
reliability and importance for the network, and perform transactions only if the nodes
possessing the absolute majority of the votes are available. In general, there exist
networks where the voting scheme cannot yield optimal availability. The open question,
which was the subject of extensive testing, is the following: can one use the voting
scheme to approximate well the optimal scheme of replica control? The results suggest
that the voting scheme is not significantly inferior to the optimal solution (while
always much simpler and easier to implement).
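
The voting rule can be sketched as follows. The function names, the node labels, and the simplified failure model (nodes fail independently, links never fail) are assumptions of this sketch and not part of the reported experiments. Integrity follows because any two sets of nodes that each hold an absolute majority of the votes must share at least one node.

```python
from itertools import product


def has_quorum(votes, available):
    """True iff the available nodes hold an absolute majority of all votes."""
    total = sum(votes.values())
    held = sum(votes[node] for node in available)
    return 2 * held > total


def availability(votes, up_prob):
    """Probability that a transaction can be performed.

    Simplifying assumption: each node is up independently with probability
    up_prob[node], and all nodes that are up can reach one another.
    """
    nodes = list(votes)
    total = 0.0
    # Enumerate every up/down pattern; this is exponential in the node count.
    for states in product((False, True), repeat=len(nodes)):
        prob = 1.0
        for node, up in zip(nodes, states):
            prob *= up_prob[node] if up else 1.0 - up_prob[node]
        if has_quorum(votes, [n for n, up in zip(nodes, states) if up]):
            total += prob
    return total


if __name__ == "__main__":
    reliab = {"a": 0.9, "b": 0.9, "c": 0.9}
    print(availability({"a": 1, "b": 1, "c": 1}, reliab))  # one vote each
    print(availability({"a": 1, "b": 0, "c": 0}, reliab))  # single primary copy
```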

The second subject of testing was heuristics to learn an optimal or near-optimal
assignment of votes, using the methodology of neural networks. Certain families of
special cases were used as benchmarks. Regretfully, no heuristic handled all benchmark
cases in a satisfactory manner. The difficulty is that, unlike in a classic neural
network, the heuristics do not have precise feedback. The empirical vote assignment
should result in allowing transactions with maximal possible frequency, subject to
integrity constraints. However, the mere fact that a transaction is not allowed is not
a sufficient basis for negative feedback. We are still trying to find a proper
correction for negative feedback and a proper regime of weight updates.

The more theoretical part of this research resulted in a technical report: “Voting
and Other Static Schemes for Managing Replicated Data Bases,” P. Berman and
M. Obradovic, Technical Report CS-89-46, Department of Computer Science, Penn State
University, 1989.

2.6.  Related  Research 

In a class on Computer Graphics and Computational Geometry, the equipment was used
mainly to experiment with fractals. This included work on rewriting systems, Julia
sets and the Mandelbrot set, as well as random fractals for modeling mountains. We
would have been unable to perform these projects with departmental equipment due to
the computation-intensive algorithms, particularly for Julia sets and the Mandelbrot
set. Besides enabling us to give proper instruction, these projects also brought up
research questions that we intend to work on in the future. These questions are mainly
concerned with the computational complexity of algorithms for fractals, an area that
has apparently received little attention in the past.
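
To indicate where the computational cost comes from, here is a minimal sketch of the usual escape-time iteration for the Mandelbrot set. It is not the class code: the viewing window, the iteration cap, and the text rendering are assumptions made for illustration. Each point may need up to `max_iter` complex multiplications, so the work grows with width times height times `max_iter`.

```python
def escape_time(c, max_iter=100):
    """Iterate z -> z*z + c from z = 0; return the step at which |z| exceeds 2.

    Returns max_iter if the orbit stays bounded for that many steps, in which
    case c is treated as belonging to the set.
    """
    z = 0j
    for k in range(max_iter):
        if abs(z) > 2.0:
            return k
        z = z * z + c
    return max_iter


def mandelbrot_grid(width, height, max_iter=50):
    """Render the window [-2, 1] x [-1.25, 1.25] as rows of '#' and '.'.

    Requires width >= 2 and height >= 2.
    """
    rows = []
    for j in range(height):
        im = -1.25 + 2.5 * j / (height - 1)
        row = ""
        for i in range(width):
            re = -2.0 + 3.0 * i / (width - 1)
            inside = escape_time(complex(re, im), max_iter) == max_iter
            row += "#" if inside else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(mandelbrot_grid(61, 21)))
```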

3.  Conclusion 

The equipment grant was used to purchase a moderate-cost state-of-the-art computing
environment for the Principal Investigators and Research Assistants of Grant
AFOSR-87-0400. In addition to the experiments performed in conjunction with that
research, the equipment provided the daily computing environment for the researchers,
performing such necessary tasks as electronic mail communication with researchers at
other institutions, and the typesetting of research publications. The Department of
Computer Science at Penn State University was and is financially unable to provide
adequate computing facilities to faculty. The equipment has done much to increase the
productivity and quality of the research of all concerned. These secondary benefits
are not to be ignored.